
 integrated gradient


eae3af0f5868f0a2eceb74208966d55b-Paper-Conference.pdf

Neural Information Processing Systems

Modern LLMs are increasingly deep, and depth correlates with performance, albeit with diminishing returns. However, do these models use their depth efficiently? Do they compose more features to create higher-order computations that are impossible in shallow models, or do they merely spread the same kinds of computation out over more layers? To address these questions, we analyze the residual stream of the Llama 3.1, Qwen 3, and OLMo 2 model families. We find: First, comparing the output of the sublayers to the residual stream reveals that layers in the second half contribute much less than those in the first half, with a clear phase transition between the two halves.
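One plausible way to make "comparing the output of the sublayers to the residual stream" concrete is a per-layer norm ratio. The sketch below is an illustration under stated assumptions, not the paper's measurement: the function name `relative_contributions`, the choice of the L2 norm, and the toy vectors are all hypothetical, and a real analysis would read activations from the model rather than from hand-written lists.

```python
import math
from typing import List, Sequence


def l2_norm(v: Sequence[float]) -> float:
    """Euclidean norm of a vector."""
    return math.sqrt(sum(a * a for a in v))


def relative_contributions(
    residual_in: Sequence[float],
    sublayer_outputs: Sequence[Sequence[float]],
) -> List[float]:
    """For each sublayer, return ||sublayer output|| / ||residual before it||.

    The residual stream is updated additively after each sublayer,
    mirroring the usual residual connection (an assumption of this sketch).
    """
    residual = list(residual_in)
    ratios = []
    for out in sublayer_outputs:
        ratios.append(l2_norm(out) / l2_norm(residual))
        residual = [r + o for r, o in zip(residual, out)]
    return ratios


# Toy data (hypothetical): an early sublayer writes a large update,
# a later sublayer writes a small one.
ratios = relative_contributions([3.0, 4.0], [[3.0, 4.0], [0.6, 0.8]])
```

A smaller ratio in later layers would correspond to the kind of reduced contribution the abstract describes for the second half of the network.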



Path-Sampled Integrated Gradients

arXiv.org Machine Learning

We introduce path-sampled integrated gradients (PS-IG), a framework that generalizes feature attribution by computing the expected value over baselines sampled along the linear interpolation path. We prove that PS-IG is mathematically equivalent to path-weighted integrated gradients, provided the weighting function matches the cumulative distribution function of the sampling density. This equivalence allows the stochastic expectation to be evaluated via a deterministic Riemann sum, improving the error convergence rate from $O(m^{-1/2})$ to $O(m^{-1})$ for smooth models. Furthermore, we demonstrate analytically that PS-IG functions as a variance-reducing filter against gradient noise - strictly lowering attribution variance by a factor of 1/3 under uniform sampling - while preserving key axiomatic properties such as linearity and implementation invariance.
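The stated equivalence can be checked numerically on a toy model. The sketch below is one reading of the abstract, not the authors' code: it assumes the sampled baselines are x' + t(x - x') with t drawn from a density on [0, 1], and that the path-weighted form multiplies the gradient at path position s by the CDF of that density, which is w(s) = s for uniform sampling. The names `weighted_ig`, `path_sampled_ig`, and `toy_grad` are hypothetical, and a fixed grid of t values stands in for random uniform draws so the result is reproducible.

```python
from typing import Callable, List, Sequence

Vector = List[float]
GradFn = Callable[[Sequence[float]], Vector]


def weighted_ig(grad_f: GradFn, x: Sequence[float], baseline: Sequence[float],
                weight: Callable[[float], float], steps: int = 200) -> Vector:
    """Path-weighted IG via a midpoint Riemann sum over s in [0, 1].

    With weight(s) == 1 this reduces to plain integrated gradients.
    """
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    total = [0.0] * len(diff)
    for k in range(steps):
        s = (k + 0.5) / steps
        point = [bi + s * di for bi, di in zip(baseline, diff)]
        w = weight(s)
        for i, gi in enumerate(grad_f(point)):
            total[i] += w * gi / steps
    return [di * ti for di, ti in zip(diff, total)]


def path_sampled_ig(grad_f: GradFn, x: Sequence[float],
                    baseline: Sequence[float], ts: Sequence[float],
                    steps: int = 200) -> Vector:
    """Average plain IG over baselines placed at positions ts on the path."""
    acc = [0.0] * len(x)
    for t in ts:
        shifted = [bi + t * (xi - bi) for xi, bi in zip(x, baseline)]
        attr = weighted_ig(grad_f, x, shifted, lambda s: 1.0, steps)
        acc = [a + v / len(ts) for a, v in zip(acc, attr)]
    return acc


def toy_grad(point: Sequence[float]) -> Vector:
    """Gradient of the toy model f(x) = 0.5 * sum(x_i ** 2)."""
    return list(point)


x = [1.0, -2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
grid = [(j + 0.5) / 200 for j in range(200)]  # stand-in for uniform samples
sampled = path_sampled_ig(toy_grad, x, baseline, grid)
weighted = weighted_ig(toy_grad, x, baseline, lambda s: s)
```

For this quadratic toy model both estimates should approach x_i^2 / 3 per feature. The sampled version costs one full IG evaluation per baseline, while the weighted version needs a single pass along the path, which is the practical appeal of the deterministic form.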


A Attribution methods for Concepts

Neural Information Processing Systems

In our case, it boils down to: [...] The smoothing effect induced by the average helps to reduce the visual noise, and hence improves the explanations. For the experiment, m and [...] are the same as SmoothGrad. We start by deriving the closed form of Saliency (SA) and naturally Gradient-Input (GI): [...] The case of VarGrad is specific, as the gradient of a linear system being constant, its variance is null. We recall that for Gradient Input, Integrated Gradients, Occlusion, [...] It was quickly realized that they unified properties of various domains such as graph theory, linear algebra or geometry. Later, in the '60s, a connection was made [...] At each step, the insertion metric selects the concepts of maximum score given a cardinality constraint.
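The remarks about linear systems can be illustrated with a small numerical sketch. It assumes a generic linear map g(u) = w · u over concept coefficients u; the symbols `w` and `u`, the function names, and the noise settings are hypothetical choices for this example, since the source's own formulas are not recoverable from the excerpt. For such a map the gradient is the constant vector w, so Saliency equals w, Gradient-Input equals w_i * u_i, SmoothGrad averages identical gradients, and VarGrad is zero.

```python
import random
from typing import List, Sequence

W = [0.5, -2.0, 1.5]  # weights of the toy linear map (hypothetical values)


def grad_linear(u: Sequence[float]) -> List[float]:
    """Gradient of g(u) = sum(W_i * u_i); it does not depend on u."""
    return list(W)


def saliency(u: Sequence[float]) -> List[float]:
    return grad_linear(u)


def gradient_input(u: Sequence[float]) -> List[float]:
    return [gi * ui for gi, ui in zip(grad_linear(u), u)]


def noisy_grads(u: Sequence[float], m: int, sigma: float,
                seed: int = 0) -> List[List[float]]:
    """Gradients evaluated at m Gaussian-perturbed copies of u."""
    rng = random.Random(seed)
    return [grad_linear([ui + rng.gauss(0.0, sigma) for ui in u])
            for _ in range(m)]


def smoothgrad(u: Sequence[float], m: int = 8, sigma: float = 0.1) -> List[float]:
    grads = noisy_grads(u, m, sigma)
    return [sum(g[i] for g in grads) / m for i in range(len(u))]


def vargrad(u: Sequence[float], m: int = 8, sigma: float = 0.1) -> List[float]:
    grads = noisy_grads(u, m, sigma)
    mean = [sum(g[i] for g in grads) / m for i in range(len(u))]
    return [sum((g[i] - mean[i]) ** 2 for g in grads) / m
            for i in range(len(u))]


u = [2.0, 1.0, -1.0]
sa = saliency(u)
gi = gradient_input(u)
sg = smoothgrad(u)
vg = vargrad(u)
```

Because the gradient never changes under the perturbations, the variance-based attribution carries no information in the linear case, which is why it needs separate treatment.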